A sub-band-based feature reconstruction approach for robust speaker recognition

Authors

Furong Yan

Yanbin Zhang

Jiachang Yan

Abstract

Although automatic speaker and speech recognition have been studied extensively over the past decades, lack of robustness remains a major challenge. The missing data technique (MDT) is a promising approach, but its performance depends on the correlation across frequency bands. This paper presents a new reconstruction method for feature enhancement that exploits this dependence. The degree of concentration across frequency bands is measured with principal component analysis (PCA). Theoretical analysis and experimental results show that feature vectors extracted from a sub-band (SB) are much more strongly correlated than those extracted from the full band (FB). Thus, rather than treating the spectral features as a whole, the method splits the full band into sub-bands and reconstructs the spectral features extracted from each SB individually using MDT. The reconstructed features from all sub-bands are then recombined to yield conventional mel-frequency cepstral coefficients (MFCCs) for the recognition experiments. The 2-sub-band reconstruction approach is evaluated in a speaker recognition system. The results show that the proposed approach outperforms full-band reconstruction in recognition performance in all noise conditions. Finally, the paper discusses how to choose the frequency division for the recognition task. When the FB is divided into many more sub-bands, some of the correlations across frequency channels are lost. Consequently, efficient division schemes need to be investigated to further improve recognition performance.
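The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the details of the MDT reconstruction, so a single-Gaussian conditional-mean imputation stands in for it here. The function names (`pca_concentration`, `impute_band`, `subband_mfcc`), the equal-width split into sub-bands, the reliability mask, and the choice of 13 cepstral coefficients are all assumptions made for illustration.

```python
import numpy as np

def pca_concentration(features, n_components=1):
    """Share of total variance held by the leading principal components.

    features: (n_frames, n_channels) array, e.g. log-mel energies.
    A value near 1 means the channels are strongly correlated.
    """
    cov = np.cov(features, rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)[::-1]  # eigvalsh is ascending; reverse it
    return float(eigvals[:n_components].sum() / eigvals.sum())

def impute_band(frame, reliable, mean, cov):
    """Fill unreliable channels of one frame with their Gaussian conditional mean.

    Assumed stand-in for MDT reconstruction: the estimate leans on the
    correlation between reliable and unreliable channels of the same band.
    """
    out = frame.copy()
    miss = ~reliable
    if miss.all():
        return mean.copy()  # nothing reliable: fall back to the prior mean
    if miss.any():
        s_rr = cov[np.ix_(reliable, reliable)]
        s_mr = cov[np.ix_(miss, reliable)]
        delta = frame[reliable] - mean[reliable]
        out[miss] = mean[miss] + s_mr @ np.linalg.solve(s_rr, delta)
    return out

def dct_matrix(n_out, n_in):
    """Unnormalised DCT-II basis mapping n_in channels to n_out coefficients."""
    k = np.arange(n_out)[:, None]
    n = np.arange(n_in)[None, :]
    return np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))

def subband_mfcc(log_mel, reliable, train_log_mel, n_bands=2, n_ceps=13):
    """Reconstruct each sub-band separately, recombine, then take the DCT."""
    n_channels = log_mel.shape[1]
    restored = log_mel.copy()
    # Equal-width contiguous sub-bands (an assumption, not the paper's split).
    for idx in np.array_split(np.arange(n_channels), n_bands):
        mean = train_log_mel[:, idx].mean(axis=0)
        cov = np.cov(train_log_mel[:, idx], rowvar=False)
        for t in range(log_mel.shape[0]):
            restored[t, idx] = impute_band(
                log_mel[t, idx], reliable[t, idx], mean, cov
            )
    # Recombined full-band log-mel features -> cepstral coefficients.
    return restored @ dct_matrix(n_ceps, n_channels).T
```

Comparing `pca_concentration` on the full set of channels against each sub-band slice is one way to check, on a given data set, whether sub-band features are more concentrated than full-band ones. Passing `n_bands=1` to `subband_mfcc` gives the full-band baseline for comparison.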

Similar resources

Noise Robust Speaker Identification Using Sub-Band Weighting in Multi-Band Approach

Recently, many techniques have been proposed to improve speaker identification in noisy environments. Among these techniques, we consider the feature recombination technique for the multi-band approach in noise-robust speaker identification. The conventional feature recombination technique is very effective in the band-limited noise condition, but in the broad-band noise condition, the conventional...

Robust speaker recognition using spectro-temporal autoregressive models

Speaker recognition in noisy environments is challenging when there is a mismatch in the data used for enrollment and verification. In this paper, we propose a robust feature extraction scheme based on spectro-temporal modulation filtering using two-dimensional (2-D) autoregressive (AR) models. The first step is the AR modeling of the sub-band temporal envelopes by the application of the linea...

A new method for robust speech recognition based on missing data using a bidirectional neural network

The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the family of missing feature methods. In this approach, the components of the time-frequency representation of the signal (the spectrogram) that have a low signal-to-noise ratio (SNR) are tagged as missing, deleted, and then replaced using the remaining components and statistical ...

A multi-band approach based on the probabilistic union model and frequency-filtering features for robust speech recognition

The multi-band approach has recently been introduced for recognition of speech corrupted by frequency-localized noise, showing higher robustness than the traditional full-band approach. However, the multi-band approach has been found to be less robust for wide-band noise than the full-band approach. In this paper, we present a multi-band recognition system based on the combination of the probabilis...

Linear transformations in sub-band groups for speech recognition

Linear transforms have been demonstrated to successfully achieve on-line speaker and environmental adaptation for robust recognition. This paper explores the gains in computational speed, speaker adaptation convergence rate and recognition performance obtained through the use of multi-resolution sub-band linear transforms in speech recognition. A useful feature of multiresolution processing is ...

Journal title:

EURASIP J. Audio, Speech and Music Processing

Volume: 2014, Issue:

Pages: -

Publication date: 2014
